Theory of Bubble Nucleation and Cooperativity in DNA Melting 
The onset of intermediate states (denaturation bubbles) and their role during the melting transition of DNA are studied using the Peyrard-Bishop-Dauxois model by Monte Carlo simulations with no adjustable parameters. Comparison with previously published experimental results shows excellent agreement. Melting curves, the critical DNA segment length for stability of bubbles, and the possibility of a two-state transition are studied.
Accessing the genetic code stored in DNA is central to fundamental biological processes such as replication and transcription, and this requires that the extraordinarily stable double-helical structure of the molecule open locally to physically expose the bases. Although, in the cell, proteins may actively help to separate the strands of double-stranded DNA, recent evidence 0, corroborates that a sequence-specific propensity to form strand separations (bubbles) exists at transcription initiation sites and promotes thermal bubble formation. Important thermal effects, such as the stability of different DNA sequences and the properties of denaturation bubbles, can be studied in vitro and provide important insight into the biological processes. Recent experimental studies [1 EH3] have attempted to interrogate the nature and statistical significance of such bubble states. Intriguingly, these experiments combine traditional UV absorption experiments with a novel bubble quenching technique that traps ensembles of bubbles to capture their statistical properties.

The actual melting of double-stranded DNA occurs through an entropy-driven phase transition. The entropy gained in transitioning from the very rigid double-stranded DNA to the much more flexible single-stranded DNA can, already at moderate temperatures, balance the energy cost of breaking a base-pair. Since the double-stranded helix is held together by hydrogen bonds between complementary base-pairs (two bonds for the AT pair and three bonds for the stronger GC pair), the sequence heterogeneity interplays with the entropy effects to create an extended premelting temperature window (including the biologically relevant regime) where large thermal bubbles are readily formed. Theoretical studies of the melting transition have included ones based on Ising-type models describing paired and unpaired bases, thermodynamic models like nearest-neighbor models Q, Poland-Scheraga models, simple zipper models 0, [ljj, or models that introduce a phenomenological pairing potential between the bases [11, 12, 13]. In particular, the Peyrard-Bishop-Dauxois model njj [Rj is emerging as a model that is able to appropriately describe not only the melting transition but also the sequence dependence of the bubble nucleation dynamics in the premelting regime.

Here, we compare the powerful recent experimental results in Refs. 0, Q with Monte Carlo simulations of the model proposed by Peyrard, Bishop, and Dauxois [11, 12, 13]. This model has already been successfully compared with denaturation experiments on short homogeneous sequences 0. The recent demonstration pj of the model's ability to accurately predict the locations at which large bubbles form in several viral sequences is even more exceptional. The difference between our comparison and previous ones is that we use the same (deceptively) simple model, with no further refinements that introduce new parameters to be fitted. Indeed, the parameters of the model are not changed to fit the experiments: we use the same values that were fixed in Ref. [14] for quite different DNA sequences.

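The potential energy of the model reads:

\[ V = \sum_n \left[ D_n \left( e^{-a_n y_n} - 1 \right)^2 + \frac{k}{2} \left( 1 + \rho\, e^{-\beta (y_n + y_{n-1})} \right) (y_n - y_{n-1})^2 \right] \qquad (1) \]
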
The sum is over all the base-pairs of the molecule, and y_n denotes the relative displacement from equilibrium at the nth base pair. The first term of the potential energy is a Morse potential that represents the hydrogen bonds between the bases. The second term is a nearest-neighbor coupling that represents the stacking interaction between adjacent base pairs: it comprises a harmonic coupling multiplied by a term that strengthens the coupling when the molecule is closed and weakens it when it is melted, in this way taking into account the different stiffness (i.e., entropy effects) of DNA double strands and single strands (this effect can be directly observed, in model calculations, as a softening of the characteristic frequencies of the system with rising temperature ^5|). This nonlinear coupling results in long-range cooperative effects in the denaturation, leading to an abrupt entropy-driven transition [12]. A crucial point for obtaining correct results is the accurate description of the heterogeneity of the sequence 0|. In this model it is incorporated by giving different values to the parameters of the Morse potential, depending on the base-pair type of the site considered: adenine-thymine (AT) or guanine-cytosine (GC). The parameter values we have used are those used in Ref. [14]: k = 0.025 eV/Å^2, ρ = 2, β = 0.35 Å^-1 for the inter-site coupling, while for the Morse potential D_GC = 0.075 eV, a_GC = for a GC base pair, and D_AT = 0.05 eV, a_AT = 4.2 Å^-1 for an AT pair. These parameters were chosen to fit thermodynamic properties of DNA [J^j. One should be cautious in relating these parameters directly to microscopic properties, and recall that they arise as a result of several physical phenomena at the microscopic level.
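
As an illustration of the potential described above, the following is a minimal Python sketch. It assumes the usual Peyrard-Bishop-Dauxois form: an on-site Morse term D_n (e^{-a_n y_n} − 1)^2 plus a stacking term (k/2)(1 + ρ e^{-β(y_n + y_{n-1})})(y_n − y_{n-1})^2 on an open chain. The function names are hypothetical. The value a_GC = 6.9 Å^-1 is an assumption (a value commonly quoted for this model), since the text above does not give one.

```python
import math

# Parameter values quoted in the text (energies in eV, lengths in angstroms).
K = 0.025     # harmonic stacking constant, eV/A^2
RHO = 2.0     # strength of the nonlinear stacking correction
BETA = 0.35   # decay rate of the nonlinear correction, 1/A
MORSE = {
    "AT": (0.05, 4.2),    # (D, a) for an AT base pair
    "GC": (0.075, 6.9),   # a_GC = 6.9 is ASSUMED; the text does not state it
}


def morse_params(sequence):
    """Map each base of one strand to the (D, a) parameters of its base pair."""
    return [MORSE["AT"] if base in "AT" else MORSE["GC"] for base in sequence]


def pbd_energy(y, params):
    """Potential energy of a displacement profile y (one value per base pair)."""
    total = 0.0
    for n, (depth, inv_width) in enumerate(params):
        # On-site Morse term: hydrogen bonding within the base pair.
        total += depth * (math.exp(-inv_width * y[n]) - 1.0) ** 2
        if n > 0:
            # Stacking term: harmonic coupling that is stiffer when both
            # neighbouring pairs are closed (small y) and softer when open.
            dy = y[n] - y[n - 1]
            stiffness = 1.0 + RHO * math.exp(-BETA * (y[n] + y[n - 1]))
            total += 0.5 * K * stiffness * dy * dy
    return total
```

A fully closed chain (all y_n = 0) has zero energy in this sketch, and a uniformly, widely open chain costs the sum of the Morse depths D_n, which is why GC-rich stretches are harder to open.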

Using the standard Metropolis algorithm 0, 0], we have performed Monte Carlo simulations on this model [18]. For each temperature, we performed a number of simulations. In each of these simulations, we compute the mean profile ⟨y_n⟩, from which we obtain the fraction of open base-pairs. We consider the nth base-pair to be open if ⟨y_n⟩ exceeds a certain threshold. Applying the same threshold, we record at the end of each simulation whether the entire molecule was open (denaturated). By performing a large number of such simulations starting from different initial conditions, we obtain the averaged fraction f of open base-pairs and the averaged fraction p of denaturated molecules at a given temperature. In this way, we "simulate" the experiments, where the measurements are made over a large ensemble of molecules. The threshold we have used is 0.5 Å, but we have tried other values and observed that the fraction p of denaturated molecules depends only very slightly on the threshold value. The fraction of open base pairs, f, displays a somewhat stronger dependence on the threshold value. In the same manner as Ref. we obtain the averaged fractional length of the bubbles as l = (f − p)/(1 − p).
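
The procedure above can be sketched as follows. This is a simplified illustration, not the simulation code used for the results: it assumes single-site Metropolis moves with a uniform trial step, starts every run from the closed state, does not discard an equilibration period, and uses a_GC = 6.9 Å^-1 as an assumed value. All function names and the `step` and `sweeps` parameters are hypothetical.

```python
import math
import random

K, RHO, BETA = 0.025, 2.0, 0.35     # stacking parameters from the text
KB = 8.617333e-5                    # Boltzmann constant in eV/K
MORSE = {"AT": (0.05, 4.2), "GC": (0.075, 6.9)}   # a_GC = 6.9 is ASSUMED


def site_energy(y, n, params):
    """All energy terms that involve site n (Morse term plus both stacking bonds)."""
    depth, inv_width = params[n]
    energy = depth * (math.exp(-inv_width * y[n]) - 1.0) ** 2
    for m in (n - 1, n + 1):
        if 0 <= m < len(y):
            dy = y[n] - y[m]
            energy += 0.5 * K * (1.0 + RHO * math.exp(-BETA * (y[n] + y[m]))) * dy * dy
    return energy


def mean_profile(sequence, temperature, sweeps, step=0.1, rng=random):
    """Run one Metropolis simulation and return the mean displacement per site."""
    params = [MORSE["AT"] if b in "AT" else MORSE["GC"] for b in sequence]
    y = [0.0] * len(sequence)
    totals = [0.0] * len(sequence)
    kt = KB * temperature
    for _ in range(sweeps):
        for n in range(len(y)):
            old = y[n]
            e_old = site_energy(y, n, params)
            y[n] = old + rng.uniform(-step, step)
            delta = site_energy(y, n, params) - e_old
            # Accept downhill moves always, uphill moves with Boltzmann weight.
            if delta > 0.0 and rng.random() >= math.exp(-delta / kt):
                y[n] = old
        for n, value in enumerate(y):
            totals[n] += value
    return [t / sweeps for t in totals]


def observables(profiles, threshold=0.5):
    """Return (f, p, l) from the mean profiles of many independent simulations."""
    open_sum = 0.0
    denatured = 0
    for profile in profiles:
        n_open = sum(1 for value in profile if value > threshold)
        open_sum += n_open / len(profile)
        denatured += (n_open == len(profile))
    f = open_sum / len(profiles)
    p = denatured / len(profiles)
    bubble_len = (f - p) / (1.0 - p) if p < 1.0 else float("nan")
    return f, p, bubble_len
```

A call such as `observables([mean_profile(seq, 330.0, 10000) for _ in range(200)])` would give one point of a melting curve; the run counts here are placeholders. From the definition of l it follows that f − l = p(1 − f)/(1 − p), so f and l coincide whenever fully denaturated molecules are rare.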
The experimental work 0, Q concentrated on two sets of sequences: one set (bubble-in-the-middle sequences) designed to form bubbles in the middle of the short sequence, and another set (bubble-at-the-end sequences) designed to form bubbles (openings) at one end of the sequences. Specifically, these sequences are:
(a) Bubble-in-the-middle sequences:
L60B36: CCGCCAGCGGCGTTATTACATTTAATTCTTAAGTATTATAAGTAATATGGCCGCTGCGCC
L42B18: CCGCCAGCGGCGTTAATACTTAAGTATTATGGCCGCTGCGCC
L33B9: CCGCCAGCGGCCTTTACTAAAGGCCGCTGCGCC

(b) Bubble-at-the-end sequences:
L48AS: CATAATACTTTATATTTAATTGGCGGCGCACGGGACCCGTGCGCCGCC
L36AS: CATAATACTTTATATTGCCGCGCACGCGTGCGCGGC
L30AS: ATAAAATACTTATTGCCGCACGCGTGCGGC
L24AS: ATAATAAAATTGCCCGGTCCGGGC
L19AS.2: ATAATAAAGGCGGTCCGCC

The bubble-in-the-middle sequences are rich in AT base-pairs in the middle, while the bubble-at-the-end sequences are rich in AT base-pairs at one end of the molecule. The AT base pairs are bonded by two hydrogen bonds, as opposed to the stronger triple hydrogen bonding of the GC base-pairs. This fact is reflected in the model parameters (D_AT = 0.05 eV while D_GC = 0.075 eV), and it also implies that AT-rich regions denaturate at lower temperatures than GC-rich regions.

FIG. 1: Melting profiles for the bubble-in-the-middle sequences 0, ' 4, 5]. Filled circles are p, open circles are f, and squares are l.

FIG. 2: Upper figure: σ = f − p versus T for L60B36 (circles), L42B18 (squares) and L33B9 (diamonds) H H 0. Lower figure: σ_av versus the length, L, of the molecule.

FIG. 3: Melting profile for the L48AS and L19AS.2 sequences. Symbols are as in Fig. 1.
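
The AT-rich design of the bubble-in-the-middle sequences can be checked directly from the listed bases. The short sketch below reads each label as L<total length>B<soft-region length>, which is an interpretation of the naming pattern (consistent with the 36-of-60 figure quoted for L60B36 in the results), and it assumes the soft region is centered. The helper names are hypothetical.

```python
import re

# Bubble-in-the-middle sequences as listed in the text.
SEQUENCES = {
    "L60B36": "CCGCCAGCGGCGTTATTACATTTAATTCTTAAGTATTATAAGTAATATGGCCGCTGCGCC",
    "L42B18": "CCGCCAGCGGCGTTAATACTTAAGTATTATGGCCGCTGCGCC",
    "L33B9": "CCGCCAGCGGCCTTTACTAAAGGCCGCTGCGCC",
}


def at_fraction(segment):
    """Fraction of bases in the segment that belong to an AT pair."""
    return sum(base in "AT" for base in segment) / len(segment)


def split_name(name):
    """Read 'L60B36' as (total length 60, soft-region length 36); assumed convention."""
    match = re.fullmatch(r"L(\d+)B(\d+)", name)
    return int(match.group(1)), int(match.group(2))


def describe(name, sequence):
    """Compare AT content of the central soft region with that of the flanks."""
    total, soft = split_name(name)
    flank = (total - soft) // 2            # assumes a centered soft region
    middle = sequence[flank:flank + soft]
    ends = sequence[:flank] + sequence[flank + soft:]
    return {
        "length": len(sequence),
        "soft_ratio": soft / total,
        "at_middle": at_fraction(middle),
        "at_ends": at_fraction(ends),
    }


for name, sequence in SEQUENCES.items():
    print(name, describe(name, sequence))
```

Under these assumptions the central region of each sequence is mostly AT while the flanking clamps are mostly GC, and the ratio of soft-region length to total length is 36/60 = 0.6 for L60B36 and 18/42 ≈ 0.43 for L42B18.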

In Fig. 1 we present our results for the bubble-in-the-middle sequences, and we see very good agreement with the experimental results given in Refs. |J,|5j. As in the experimental results, we find for the L60B36 sequence that f ≈ l for l < 0.6. After this point, l displays a plateau, resulting from the occurrence of completely denaturated molecules at T ≈ 65 °C. As noted in the experimental work, the plateau occurs at l ≈ 0.6 because this is the ratio between the AT-rich central region, of 36 base pairs, and the molecule's total of 60 base pairs. The fact that f ≈ l before the plateau indicates that the bubble opens continuously as a function of temperature until it reaches its full size, while there are very few completely melted molecules at these temperatures. For the L42B18 sequence we again find a plateau in l at the value 18/42 ≈ 0.43, but here f ≠ l even at the lower temperatures (this is even more pronounced for L33B9). This shows that bubble generation and complete denaturation are both possible at lower temperatures. Since the three sequences are similar in structure and differ only in the length of the AT-rich region, this demonstrates that for these structures bubbles are only sustainable if the soft region is of size 20 base-pairs or more. To further illustrate this point we show in Fig. 2 σ = f − p, which represents the fraction of bases participating in a bubble state at a given temperature. The upper figure of Fig. 2 shows, as discussed, that as the soft bubble region becomes shorter, bubble states become less important, as also seen in Ref. [4]. In the lower figure, we summarize the length dependence of the incidence of bubble states. We plot σ_av, the area under the curves in the upper figure divided by their width, versus the molecule length, L. The line is a linear fit showing that these intermediate states disappear (σ_av = 0) for L ≈ 22, in excellent agreement with the experimental conclusion in Refs. 0, |J
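
The reduction from a σ(T) curve to a single number σ_av, followed by a linear extrapolation in L, can be sketched as below. This is only an illustration under stated assumptions: "width" is taken to mean the temperature span of the sampled curve, the area is computed with the trapezoidal rule, and the fit is ordinary least squares. The function names are hypothetical, and no data from the paper are reproduced.

```python
def mean_sigma(temps, sigmas):
    """Trapezoidal area under sigma(T) divided by the temperature span."""
    area = 0.0
    for i in range(1, len(temps)):
        area += 0.5 * (sigmas[i] + sigmas[i - 1]) * (temps[i] - temps[i - 1])
    return area / (temps[-1] - temps[0])


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def zero_crossing(lengths, sigma_av_values):
    """Length at which the fitted line reaches sigma_av = 0."""
    slope, intercept = linear_fit(lengths, sigma_av_values)
    return -intercept / slope
```

Given one σ_av value per sequence, `zero_crossing(lengths, values)` returns the extrapolated length below which bubble states would no longer contribute. The result depends on how the width is defined, so the choice made here should be treated as one option among several.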

In Fig. 3 we show the melting curves for two of the bubble-at-the-end molecules. Agreement with the experimental results in ref. is again good, although not as good as in the bubble-in-the-middle cases. This is due to the limitations of our model at the ends of the DNA molecule. Most experimental features are, however, still reproduced. For instance, in the L48AS sequence we find the same plateau in l at l ≈ 0.8 that is seen in the experiments. To overcome the problem caused by the boundaries, in Fig. 4 we consider how σ and σ_av change with the system size, as the deformation imposed by the excessive end opening will appear in all the molecules and in that way will not contaminate the global picture. In the upper figure we plot σ versus T − T_m (T_m is the melting temperature), finding that the bubble states are smaller for the shorter sequences, as in Ref. 0. In the lower figure we plot σ_av versus L. The extrapolation to σ_av = 0 occurs at a value compatible with L ≈ 1, as in Ref. [5], and shows that in our model a two-state transition for this kind of sequence would only be possible in the limit L → 1, just as in the experiments.

FIG. 4: Upper figure: σ = f − p versus T − T_m (T_m is the melting temperature) for L48AS (circles), L36AS (squares), L30AS (diamonds), L24AS (triangles up) and L19AS_2 (triangles down) 0.0-13. Lower figure: σ_av versus L, the molecule's length.

We have shown that the theoretical model proposed by Peyrard, Bishop, and Dauxois, with no further parameters or fitting, accurately reproduces experiments on DNA denaturation, not only for the melting curve but also for the formation and role of bubble states in the premelting regime. Experimental observations regarding the nucleation size of the bubbles in the middle of a molecule and the possibility of a two-state transition are exactly recovered by the model. This demonstrates that the model works not only for very large DNA strands 0, but also for short strands such as the ones studied here. Remarkably, these include both natural and synthetic structures.